ESTIMATION OF PARAMETERS OF ZERO-ONE PROCESSES
BY INTERVAL SAMPLING: AN ADAPTIVE STRATEGY

by

M. Brown, H. Solomon and M. A. Stephens

1. INTRODUCTION

In a previous paper (Brown, Solomon and Stephens, hereafter referred to
as Part 1) the authors considered the problem of estimating the
parameters of a zero-one double Poisson process when sampling was
permitted only at equal intervals. Specifically, the time $t$ in state
zero is distributed with distribution $F(t) = 1-\exp(-\lambda t)$, $t>0$,
and that in state 1 with distribution $F(t) = 1-\exp(-\mu t)$, $t>0$;
sampling takes place at time intervals $\Delta$, so that a sequence of
zeros and ones is obtained, and it is required to estimate $\lambda$ and
$\mu$. Two procedures were investigated. In Procedure 1 a fixed number
of observations was taken; this has the drawback that one may see no
change of state. In Procedure 2 one observes till a fixed number of
cycles is seen. Suppose the original observation is a zero; a cycle is a
sequence of zeros followed by a sequence of ones, observed till a new
zero shows that a new cycle is beginning.

For either procedure it is clear that properties of the maximum
likelihood estimates, which were given for Procedures 1 and 2, depend
greatly on $\lambda^* = \lambda\Delta$ and $\mu^* = \mu\Delta$. Graphs
were given, based on extensive Monte Carlo studies, to illustrate this
dependence. A line was drawn on the graph, giving the experimental mean
values of the estimate $\hat\lambda^*$ against $\lambda^*$, and also
lines one standard deviation on either side were drawn to enable
confidence intervals to be calculated. This was done for graphs
illustrating both Procedures 1 and 2, for various values of the ratio
$\mu/\lambda$.

Two conclusions from these graphs were the following:

(a) if the true $\lambda^*$ (or $\mu^*$) were any value greater than 1,
the expectation of the estimate $\hat\lambda^*$ (or $\hat\mu^*$) would be
approximately 1; thus, working backwards, an apparent estimate near 1
gave no indication of the true value.

(b) reasonable confidence limits for $\lambda^*$ could be found from
$\hat\lambda^*$ only as $\lambda^*$ became small, say roughly less than
0.4.

In this paper we investigate an adaptive strategy suggested by these two
conclusions. We begin with a suitable $\Delta$ using Procedure 2, and, if
the estimates $\hat\lambda^*$ and $\hat\mu^*$ are too large, we
successively halve $\Delta$ until they are both small enough. Then all
the samples, obtained at each stage of this technique, are combined
together to give estimates of $\lambda$ and $\mu$. This involves
numerical solution of the maximum likelihood equations. We show, again
from Monte Carlo studies, that for a reasonable number of cycles at each
stage, not only is bias almost totally removed, but also estimates and
confidence intervals can be found for much higher starting values (i.e.,
those at Stage 1) of $\lambda^*$ and $\mu^*$ than for the one-sample
strategy.

For convenience, we now recall the estimation formulas of Procedure 2.
Suppose at, say, Stage 1, a sample of $n$ cycles is taken, with sampling
interval $\Delta$. Let $p_{01}$ be the probability of event $E$ that the
state of the process changes from zero at the beginning of one interval
to one at the end; $p_{01}$ depends on $\lambda$, $\mu$, and $\Delta$.
Let $n_{01}$ be the number of events $E$ observed in the data. Similarly
let $p_{00}$, $p_{10}$, and $p_{11}$ be defined, and also $n_{00}$,
$n_{10}$, $n_{11}$; clearly $p_{00} = 1-p_{01}$ and $p_{11} = 1-p_{10}$.
Finally, let $x_i$ be the length of the $i$-th string of zeros, and let
$y_i$ be the length of the $i$-th string of ones, and suppose
$\bar{x} = \sum_i x_i/n$ and $\bar{y} = \sum_i y_i/n$.

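Then (Lemma 2 of Part 1) estimates of $p_{01}$ and $p_{10}$ are
$\hat{p}_{01} = 1/\bar{x}$ and $\hat{p}_{10} = 1/\bar{y}$. Suppose
$S = \hat{p}_{01} + \hat{p}_{10}$; if $S \ge 1$, maximum likelihood
estimates of $\lambda^* = \lambda\Delta$ and of $\mu^* = \mu\Delta$ do
not exist. If $S < 1$, the estimates are

$$\hat\lambda^* = -\hat{p}_{01}\,\{\ln(1-S)\}/S, \qquad
  \hat\mu^* = -\hat{p}_{10}\,\{\ln(1-S)\}/S. \tag{1}$$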
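As an illustration of formulae (1), the following minimal Python sketch splits an observed zero-one sequence into complete cycles and computes the Procedure 2 estimates. The function names `run_lengths` and `procedure2_estimates` are hypothetical, introduced only for this example; the sequence is assumed to start with a zero, and a cycle is counted only once the closing zero has been seen.

```python
import math

def run_lengths(seq):
    """Return (xs, ys): lengths of the zero and one strings of each complete cycle."""
    if not seq or seq[0] != 0:
        raise ValueError("sequence must start with a zero")
    if any(v not in (0, 1) for v in seq):
        raise ValueError("sequence must contain only zeros and ones")
    xs, ys = [], []
    i = 0
    while True:
        x = 0
        while i < len(seq) and seq[i] == 0:   # string of zeros
            x += 1
            i += 1
        y = 0
        while i < len(seq) and seq[i] == 1:   # string of ones
            y += 1
            i += 1
        if i >= len(seq):                     # no closing zero: cycle incomplete
            break
        xs.append(x)
        ys.append(y)
    return xs, ys

def procedure2_estimates(xs, ys):
    """Estimates of lambda* and mu* from formulae (1); None if S >= 1."""
    n = len(xs)
    if n == 0:
        raise ValueError("at least one complete cycle is required")
    p01_hat = n / sum(xs)                     # 1 / x-bar
    p10_hat = n / sum(ys)                     # 1 / y-bar
    S = p01_hat + p10_hat
    if S >= 1.0:
        return None                           # estimates do not exist
    scale = -math.log(1.0 - S) / S
    return p01_hat * scale, p10_hat * scale
```

For the sequence 0 0 0 1 1 0 0 0 0 0 1 1 1 0 the two cycles have zero strings of lengths 3 and 5 and one strings of lengths 2 and 3, so that $\hat{p}_{01} = 0.25$, $\hat{p}_{10} = 0.4$ and $S = 0.65$.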

In the next section we set forth the proposed adaptive strategy.

2. AN ADAPTIVE STRATEGY


(a) If any estimates, say $\lambda_0$ and $\mu_0$, exist of $\lambda$
and $\mu$, choose $\Delta$ to be $\Delta_1$, so that $\lambda_0\Delta_1$
and $\mu_0\Delta_1$ are both less than a constant $C$. This constant
will be decided by the narrowness of the confidence intervals required;
the graphs of Part 1 suggest it should almost certainly be less than 0.4.

(b) Suppose $\lambda_1^* = \lambda\Delta_1$ and $\mu_1^* = \mu\Delta_1$.
Sample from the process till $n$ complete cycles have been observed
(Stage 1) and estimate $\lambda_1^*$ and $\mu_1^*$ by $\hat\lambda_1^*$
and $\hat\mu_1^*$, given by the formulae (1) above. If these are both
below $C$, accept them. The graphs of Part 1 can then be used to improve
the point estimates or to provide confidence intervals for $\lambda_1^*$
and $\mu_1^*$, and the estimates of $\lambda$ and $\mu$ are given by
dividing by $\Delta_1$.

(c) If an estimate $\hat\lambda_1^*$ or $\hat\mu_1^*$ is too large, or if
they do not exist, proceed to Stage 2. Let $\Delta$ be
$\Delta_2 = \tfrac{1}{2}\Delta_1$ and take a second sample of $n$
cycles. Use the Stage 2 sample only to estimate
$\lambda_2^* = \lambda\Delta_2$ and $\mu_2^* = \mu\Delta_2$; if both
estimates are less than $C$, combine the Stage 1 and Stage 2 samples to
obtain final estimates of $\lambda$ and $\mu$, in the manner to be shown
below in Section (e).

(d) If $\hat\lambda_2^*$, $\hat\mu_2^*$ are not less than $C$, take a
Stage 3 sample of $n$ cycles, with interval
$\Delta_3 = \tfrac{1}{2}\Delta_2$. Repeat this procedure, until the
estimates, using the Stage $m$ sample only, of
$\lambda_m^* = \lambda\Delta_m$ and of $\mu_m^* = \mu\Delta_m$ (where
$\Delta_m = \tfrac{1}{2}\Delta_{m-1}$), are both less than $C$. Then
combine all the samples, from all stages, to obtain final estimates of
$\lambda$ and $\mu$.
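Steps (b) to (d) can be sketched as a simulation loop. The code below is only an illustration under stated assumptions: the process observed at spacing $\Delta$ is simulated directly as a two-state chain with the transition probabilities $p_{01}$ and $p_{10}$ given in Section (e), every stage is assumed to start from an observed zero, and all function names (`transition_probs`, `sample_cycles`, `stage_estimates`, `adaptive_sampling`) and the `max_stages` guard are hypothetical.

```python
import math
import random

def transition_probs(lam, mu, delta):
    """p01 and p10 for the process observed at spacing delta."""
    g = lam + mu
    T = 1.0 - math.exp(-g * delta)
    return lam * T / g, mu * T / g

def sample_cycles(lam, mu, delta, n, rng):
    """Simulate n cycles; return lengths of the zero strings and one strings."""
    p01, p10 = transition_probs(lam, mu, delta)
    xs, ys = [], []
    for _ in range(n):
        x = 1                        # the zero that opens the cycle
        while rng.random() >= p01:   # next observation is still a zero
            x += 1
        y = 1                        # the first one has just been observed
        while rng.random() >= p10:   # next observation is still a one
            y += 1
        xs.append(x)
        ys.append(y)
    return xs, ys

def stage_estimates(xs, ys):
    """Formulae (1) for a single stage; None if the estimates do not exist."""
    n = len(xs)
    p01_hat, p10_hat = n / sum(xs), n / sum(ys)
    S = p01_hat + p10_hat
    if S >= 1.0:
        return None
    scale = -math.log(1.0 - S) / S
    return p01_hat * scale, p10_hat * scale

def adaptive_sampling(lam, mu, delta1, n=5, C=0.4, max_stages=30, rng=None):
    """Halve delta until both single-stage estimates fall below C."""
    if rng is None:
        rng = random.Random()
    delta, stages = delta1, []
    for _ in range(max_stages):
        xs, ys = sample_cycles(lam, mu, delta, n, rng)
        stages.append((delta, xs, ys))
        est = stage_estimates(xs, ys)
        if est is not None and est[0] < C and est[1] < C:
            # final-stage estimates of lambda and mu (divide by delta)
            return stages, (est[0] / delta, est[1] / delta)
        delta /= 2.0
    raise RuntimeError("criterion not met within max_stages")
```

The function returns the samples from every stage together with the final-stage estimates of $\lambda$ and $\mu$; the latter serve only as starting values for the combined estimation of Section (e).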

(e) Before proceeding with later steps in the operating procedure we
show how these final estimates $\hat\lambda$ and $\hat\mu$ are to be
found. To simplify the notation, let $p_{01}$ be $p$ and let $p_{10}$ be
$r$; then $p_{00} = 1-p$ and $p_{11} = 1-r$. Further, let $n_{01}$,
$n_{00}$, $n_{10}$, $n_{11}$ be $k$, $l$, $v$, $w$ respectively.
Finally, suppose for the moment $\Delta_1 = 1$, so
$\Delta_j = 1/2^{j-1}$. At Stage $j$, the logarithm of the likelihood is

$$L_j = k_j \ln p_j + l_j \ln(1-p_j) + v_j \ln r_j + w_j \ln(1-r_j),$$

where the subscript $j$ denotes the value for Stage $j$. The overall
log-likelihood, for Stages 1 to $m$, is

$$L = \sum_{j=1}^{m} L_j .$$

In these expressions, we have

$$p_j = \frac{\lambda}{\lambda+\mu}\,T_j \quad\text{and}\quad
  r_j = \frac{\mu}{\lambda+\mu}\,T_j ,$$

where $S_j = \exp\{-(\lambda+\mu)\Delta_j\}$, and $T_j = 1-S_j$.
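The combined log-likelihood can be evaluated directly from the stage counts. The sketch below assumes that each stage is summarised by a hypothetical tuple `(delta_j, k_j, l_j, v_j, w_j)`; the helper `stage_counts` additionally assumes, from the definition of a cycle, that a stage of $n$ cycles contributes $k = v = n$ transitions, $l = \sum x_i - n$ and $w = \sum y_i - n$. Neither name comes from the original paper.

```python
import math

def stage_counts(xs, ys):
    """Counts (k, l, v, w) implied by the run lengths of n complete cycles."""
    n = len(xs)
    return n, sum(xs) - n, n, sum(ys) - n

def log_likelihood(lam, mu, stages):
    """Sum of L_j over stages; stages is a list of (delta, k, l, v, w)."""
    g = lam + mu
    total = 0.0
    for delta, k, l, v, w in stages:
        T = 1.0 - math.exp(-g * delta)
        p = lam * T / g                      # p_j = lambda T_j / (lambda + mu)
        r = mu * T / g                       # r_j = mu T_j / (lambda + mu)
        total += (k * math.log(p) + l * math.log(1.0 - p)
                  + v * math.log(r) + w * math.log(1.0 - r))
    return total
```

With a single stage, this function is maximized at the values given by formulae (1), which provides a simple check of an implementation.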

We wish to maximize the log-likelihood; because of the changing
$\Delta_j$ it is not now possible to work with the probabilities $p_j$
and $r_j$, as was done in Lemmas 1 and 2 of Part 1, but we must find the
derivatives of $L$ with respect to $\lambda$ and $\mu$. These in turn
require the derivatives of $p_j$ and $r_j$. Let
$\gamma = \lambda + \mu$; then it may be shown that

$$\frac{\partial p_j}{\partial\lambda}
    = \frac{\mu T_j}{\gamma^2} + \frac{\lambda\Delta_j S_j}{\gamma},
  \qquad
  \frac{\partial p_j}{\partial\mu}
    = -\frac{\lambda T_j}{\gamma^2} + \frac{\lambda\Delta_j S_j}{\gamma},$$

and corresponding expressions can easily be deduced for the derivatives
of $r_j$.
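For reference, the expressions for $r_j$ follow by the same differentiation. The lines below are derived here from the definitions $r_j = \mu T_j/\gamma$, $T_j = 1 - \exp(-\gamma\Delta_j)$ and $\gamma = \lambda + \mu$; they are a sketch of the calculation, not a transcription of the original display.

```latex
\frac{\partial T_j}{\partial\lambda} = \frac{\partial T_j}{\partial\mu}
  = \Delta_j S_j ,
\qquad
\frac{\partial r_j}{\partial\mu}
  = \frac{T_j}{\gamma} - \frac{\mu T_j}{\gamma^2}
    + \frac{\mu\Delta_j S_j}{\gamma}
  = \frac{\lambda T_j}{\gamma^2} + \frac{\mu\Delta_j S_j}{\gamma},
\qquad
\frac{\partial r_j}{\partial\lambda}
  = -\frac{\mu T_j}{\gamma^2} + \frac{\mu\Delta_j S_j}{\gamma}.
```

These are the derivatives of $p_j$ with the roles of $\lambda$ and $\mu$ exchanged.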


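Returning to the derivatives of $L$, we then have

$$\frac{\partial L}{\partial\lambda}
  = \sum_{j=1}^{m} \frac{\partial L_j}{\partial\lambda}
  = \sum_{j=1}^{m} \left\{
      \left(\frac{k_j}{p_j} - \frac{l_j}{1-p_j}\right)
        \frac{\partial p_j}{\partial\lambda}
    + \left(\frac{v_j}{r_j} - \frac{w_j}{1-r_j}\right)
        \frac{\partial r_j}{\partial\lambda} \right\}$$

and

$$\frac{\partial L}{\partial\mu}
  = \sum_{j=1}^{m} \frac{\partial L_j}{\partial\mu}
  = \sum_{j=1}^{m} \left\{
      \left(\frac{k_j}{p_j} - \frac{l_j}{1-p_j}\right)
        \frac{\partial p_j}{\partial\mu}
    + \left(\frac{v_j}{r_j} - \frac{w_j}{1-r_j}\right)
        \frac{\partial r_j}{\partial\mu} \right\},$$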

and maximum likelihood estimates of $\lambda$ and $\mu$ are given by

$$\frac{\partial L}{\partial\lambda} = 0 \quad\text{and}\quad
  \frac{\partial L}{\partial\mu} = 0 . \tag{2}$$


These simultaneous equations can be solved by iteration, using as
starting values the estimates of $\lambda$, $\mu$ obtained from the
final stage $m$. The procedure is as follows, using Newton's method for
two variables to generate the iteration. Suppose
$\partial L/\partial\lambda = Z_1(\lambda,\mu)$ and
$\partial L/\partial\mu = Z_2(\lambda,\mu)$, so that (2) becomes

$$Z_1(\lambda,\mu) = 0 ; \qquad Z_2(\lambda,\mu) = 0 . \tag{3}$$


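We have

$$\frac{\partial Z_1}{\partial\lambda}
  = \sum_{j=1}^{m} \frac{\partial^2 L_j}{\partial\lambda^2},$$

where

$$\frac{\partial^2 L_j}{\partial\lambda^2}
  = -\left\{\frac{k_j}{p_j^2} + \frac{l_j}{(1-p_j)^2}\right\}
       \left(\frac{\partial p_j}{\partial\lambda}\right)^2
    + \left\{\frac{k_j}{p_j} - \frac{l_j}{1-p_j}\right\}
       \frac{\partial^2 p_j}{\partial\lambda^2}
    - \left\{\frac{v_j}{r_j^2} + \frac{w_j}{(1-r_j)^2}\right\}
       \left(\frac{\partial r_j}{\partial\lambda}\right)^2
    + \left\{\frac{v_j}{r_j} - \frac{w_j}{1-r_j}\right\}
       \frac{\partial^2 r_j}{\partial\lambda^2} .$$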
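Corresponding expressions can easily be found for the remaining
derivatives of $Z_1$ and $Z_2$ with respect to $\lambda$ and $\mu$. Then
suppose $\lambda_0$ and $\mu_0$ are estimates of $\lambda$ and $\mu$,
and let $s = \lambda - \lambda_0$ and $t = \mu - \mu_0$. Approximations
to $s$ and $t$ are given by

$$s = \frac{Z_2\,\dfrac{\partial Z_1}{\partial\mu}
            - Z_1\,\dfrac{\partial Z_2}{\partial\mu}}
           {\dfrac{\partial Z_1}{\partial\lambda}\dfrac{\partial Z_2}{\partial\mu}
            - \dfrac{\partial Z_1}{\partial\mu}\dfrac{\partial Z_2}{\partial\lambda}},
  \qquad
  t = \frac{Z_2\,\dfrac{\partial Z_1}{\partial\lambda}
            - Z_1\,\dfrac{\partial Z_2}{\partial\lambda}}
           {\dfrac{\partial Z_1}{\partial\mu}\dfrac{\partial Z_2}{\partial\lambda}
            - \dfrac{\partial Z_1}{\partial\lambda}\dfrac{\partial Z_2}{\partial\mu}},
  \tag{4}$$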


where the $Z_i$ and the partial derivatives are all evaluated at
$\lambda_0$ and $\mu_0$. Then new solutions of (3) are
$\hat\lambda = \lambda_0 + s$, $\hat\mu = \mu_0 + t$; these values now
become $\lambda_0$ and $\mu_0$, and the above process is repeated from
equations (4), to give successive sets of solutions until two sets are
close enough to be regarded as accurate. A FORTRAN computer program is
available from one of us (M.A.S.) to perform these calculations. We now
return to the steps in the basic strategy.
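The iteration can be sketched in a few lines of Python. This is not the FORTRAN program mentioned above: the names `probs_and_grads`, `score`, `jacobian` and `newton` are hypothetical, each stage is assumed to be summarised as `(delta, k, l, v, w)`, and, as a simplification, the partial derivatives of $Z_1$ and $Z_2$ are approximated by central differences rather than computed from the analytic second derivatives.

```python
import math

def probs_and_grads(lam, mu, delta):
    """p, r and their first derivatives with respect to (lambda, mu)."""
    g = lam + mu
    S = math.exp(-g * delta)
    T = 1.0 - S
    p, r = lam * T / g, mu * T / g
    cp, cr = lam * delta * S / g, mu * delta * S / g
    dp = (mu * T / g**2 + cp, -lam * T / g**2 + cp)
    dr = (-mu * T / g**2 + cr, lam * T / g**2 + cr)
    return p, r, dp, dr

def score(lam, mu, stages):
    """Z1 = dL/dlambda and Z2 = dL/dmu summed over all stages."""
    z1 = z2 = 0.0
    for delta, k, l, v, w in stages:
        p, r, dp, dr = probs_and_grads(lam, mu, delta)
        a = k / p - l / (1.0 - p)
        b = v / r - w / (1.0 - r)
        z1 += a * dp[0] + b * dr[0]
        z2 += a * dp[1] + b * dr[1]
    return z1, z2

def jacobian(lam, mu, stages, h=1e-6):
    """Central-difference approximation to the derivatives of Z1 and Z2."""
    z1a, z2a = score(lam + h, mu, stages)
    z1b, z2b = score(lam - h, mu, stages)
    z1c, z2c = score(lam, mu + h, stages)
    z1d, z2d = score(lam, mu - h, stages)
    return ((z1a - z1b) / (2 * h), (z1c - z1d) / (2 * h),
            (z2a - z2b) / (2 * h), (z2c - z2d) / (2 * h))

def newton(lam0, mu0, stages, tol=1e-9, max_iter=50):
    """Repeat the corrections s, t of equations (4) until both are below tol."""
    lam, mu = lam0, mu0
    for _ in range(max_iter):
        z1, z2 = score(lam, mu, stages)
        d11, d12, d21, d22 = jacobian(lam, mu, stages)
        det = d11 * d22 - d12 * d21
        s = (z2 * d12 - z1 * d22) / det
        t = (z1 * d21 - z2 * d11) / det      # same as (z2*d11 - z1*d21) / (-det)
        lam, mu = lam + s, mu + t
        if abs(s) < tol and abs(t) < tol:
            return lam, mu
    raise RuntimeError("Newton iteration did not converge")
```

With a single stage the iteration should reproduce formulae (1) after division by $\Delta$, which gives a convenient check. As with any Newton scheme, convergence is only to be expected from starting values reasonably close to the solution.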

(f) A reasonable figure for $n$ appears to be 5. If $n = 5$, use the
estimates to decide roughly the ratio $\mu/\lambda = r$, say. Figures 1,
2, and 3 then give the graph of $\hat\lambda$ versus $\lambda$, for
$r = 1$, 2 and 4; also lines are drawn one standard deviation on either
side, from which confidence intervals for $\lambda$ can be found. For
other values of $r$, intervals must be found by interpolation. The
graphs may also be used to find intervals for $\mu$ if $\hat\mu$ appears
on the y-axis; otherwise the intervals for $\lambda$ must be used, with
the estimated multiplier $\hat{r}$.
Example 1. For example, suppose we start with $\Delta_1 = 1$, and after
four stages $\hat\lambda_4^*$ and $\hat\mu_4^*$ are both less than $C$.
Let the final estimates then be $\hat\lambda = 1.28$ and
$\hat\mu = 0.96$. Now $\hat\mu$ is approximately equal to $\hat\lambda$,
so from Figure 1, entering at $\hat\lambda = 1.28$ we have the confidence
interval $0.97 < \lambda < 1.76$, and entering at 0.96 we have
$0.72 < \mu < 1.32$.

Example 2. Suppose now, again after 4 stages, the final estimates are
$\hat\lambda = 1.28$ and $\hat\mu = 2.88$. Figure 2 will be used, since
$\hat\mu$ is approximately $2\hat\lambda$. The confidence interval for
$\lambda$ is $0.95 < \lambda < 1.73$, hardly different from that in
Example 1. Since 2.88 cannot be entered on the vertical axis, the
confidence interval for $\mu$ can be found by using that for $\lambda$
and multiplying by $\hat{r} = \hat\mu/\hat\lambda = 2.25$. The interval
is then $2.14 < \mu < 3.89$.
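The arithmetic of Example 2 is easily checked. In the short sketch below the helper name `scale_interval` is hypothetical, and the interval for $\lambda$ is simply copied from the example, since the figure itself is not available to the code.

```python
def scale_interval(interval, factor):
    """Multiply both end points of an interval by a positive factor."""
    lo, hi = interval
    return lo * factor, hi * factor

lam_hat, mu_hat = 1.28, 2.88
r_hat = mu_hat / lam_hat                  # estimated multiplier
lam_interval = (0.95, 1.73)               # read from Figure 2 in the example
mu_lo, mu_hi = scale_interval(lam_interval, r_hat)
print(round(r_hat, 2), round(mu_lo, 2), round(mu_hi, 2))   # 2.25 2.14 3.89
```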

These intervals are roughly 66% intervals, based on a normal
distribution for $\hat\lambda$ and $\hat\mu$, and this is an
approximation. Other procedures, to find a joint confidence zone for
both parameters, will be discussed in Section 3, after the results of
the Monte Carlo studies have been presented.

3. TESTING THE STRATEGY


The strategy was tested by choosing $\lambda$ and $\mu$, with ratios
$r = \mu/\lambda = 1$, 1.5, 2 and 4, and then following the stages above
for $C = 0.4$. The final estimates $\hat\lambda$ and $\hat\mu$ were
recorded, together with the number of stages required and the total
number of observations $n_t$ needed to make an estimate. This was
repeated for 1000 Monte Carlo samples or more, and the mean of the
estimates, called $\bar\lambda$, is plotted against $\lambda$ in the
Figures. The two other lines are as described above, one standard
deviation on either side of the mean line. The standard deviation is
estimated from the Monte Carlo samples. Values of $n_t$ are included in
the figures.

Comments: Throughout these comments we assume that $\hat\lambda$ is the
smaller of the two estimates.

(a) The first remarkable result is that, for $n = 5$, $\bar\lambda$ is
almost equal to $\lambda$ over the whole range covered; it is biased
upward, but not strongly so, as it was in the one-stage sampling of
Part 1. This is true also for the estimate $\hat\mu$, even when $\mu$ is
four times $\lambda$. Values of $\sqrt{b_1}$ and $b_2$, the skewness and
kurtosis parameters, were calculated for the $\hat\lambda$ and $\hat\mu$
estimates and show the distributions to be not normal, but somewhat
skew.

(b) Figure 4 shows the estimates of $\sigma(\hat\lambda)$ and of
$\sigma(\hat\mu)$, obtained from the Monte Carlo samples. The standard
deviation of $\hat\lambda$ appears to be practically independent of
$r = \mu/\lambda$, at least for the situations considered here ($n = 5$;
$r = 1, 2, 4$; $0 < \lambda < 2.5$), and the points shown are those for
$\mu = \lambda$, i.e. $r = 1$. The standard deviations of $\hat\mu$ are
shown for $r = 2$ and 4; those for $r = 1$ are, as expected, similar to
those for $\hat\lambda$. Some points are shown for double runs with the
same $\lambda$; they give an indication of the variability of
$\hat\sigma(\hat\mu)$. The graphs suggest that $\sigma(\hat\lambda)$ and
$\sigma(\hat\mu)$ are practically linear in $\lambda$, with a rather
strange zero effect at low $\lambda$. The standard deviations of
$\hat\mu$ are not quite $r$ times those of $\hat\lambda$, presumably
because of correlation between the estimates. The suggestion to use
$\hat{r}$ to obtain confidence intervals for $\mu$, when they cannot be
directly obtained from the Figures, may therefore be conservative.




(c) The graphs show that confidence intervals for $\lambda$ and $\mu$,
even though approximate, can now be found for $\lambda$ up to, say, 2.0,
a much higher limit than was available with the one-stage sampling of
Part 1.


(d) It is clear that $n$, the number of cycles observed per stage, will
influence the results. Figure 6 gives the graph of $E(\hat\lambda)$
against $\lambda$ for $n = 2$, $\mu = \lambda$. The estimates have a
greater bias than in Figure 1. They are obtained for a smaller average
number of observations, but the standard deviation is then greater than
for $n = 5$. It would appear that $n = 5$ is a reasonable number of
cycles per stage.


(e) The estimates of $\lambda$ and $\mu$ are correlated, and estimates
of the correlation $\rho$ were found from the Monte Carlo studies. It
also appears to be independent of $r$, and the curve of $\hat\rho$
against $\lambda$, given in Figure 5, is compiled from all results for
$r = 1$, 2, and 4 ($n = 5$).

We then examined whether this correlation could be brought into the
inference procedures, in a way similar to its use with a bivariate
normal distribution. Consider the statistic

$$Z = \frac{1}{1-\rho^2}\left\{
        \frac{(\hat\lambda-\lambda)^2}{\sigma_\lambda^2}
        - 2\rho\,\frac{(\hat\lambda-\lambda)(\hat\mu-\mu)}{\sigma_\lambda\sigma_\mu}
        + \frac{(\hat\mu-\mu)^2}{\sigma_\mu^2} \right\} .$$

Although the marginal distributions of $\hat\lambda$ and $\hat\mu$ are
skew, it might be still true that $Z$ will be approximately $\chi^2_2$
distributed, as it would be if $\hat\lambda$ and $\hat\mu$ were
bivariate-normally distributed with means $\lambda$, $\mu$, variances
$\sigma_\lambda^2$, $\sigma_\mu^2$, and correlation $\rho$. This was
examined in a second batch of Monte Carlo runs, using values of the
variances and correlations derived from the first Monte Carlo samples.
These are only estimates (but very good estimates; they are based on
1,000 samples, and then smoothed) and it was found that $Z$ is well
approximated by $\chi^2_2$ over the ranges of $\lambda$ and $\mu$
considered.
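The statistic $Z$ is straightforward to compute once the standard deviations and the correlation are supplied. A minimal sketch follows; the function name `z_statistic` and its argument names are hypothetical.

```python
def z_statistic(lam_hat, mu_hat, lam, mu, sd_lam, sd_mu, rho):
    """Quadratic form Z for estimates (lam_hat, mu_hat) about (lam, mu)."""
    u = (lam_hat - lam) / sd_lam          # standardized deviation of lambda-hat
    v = (mu_hat - mu) / sd_mu             # standardized deviation of mu-hat
    return (u * u - 2.0 * rho * u * v + v * v) / (1.0 - rho * rho)
```

When $\rho = 0$ the statistic reduces to the sum of the two squared standardized deviations.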

Tests of hypotheses. Thus to test the hypothesis $H_0$ that
$\lambda = \lambda_0$ and $\mu = \mu_0$, we find the estimates
$\hat\lambda$ and $\hat\mu$ from the strategy; obtain $\sigma_\lambda$,
$\sigma_\mu$, and $\rho$ from Figures 4 and 5, and calculate $Z$; reject
$H_0$ if $Z$ is greater than the appropriate upper tail significance
point, for level $\alpha$, of the $\chi^2_2$ distribution. Note that
this significance point is given by $-2 \ln \alpha$.
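The closed form of the significance point follows from the fact that a $\chi^2$ variable with two degrees of freedom has an exponential distribution with mean 2. As a short worked step:

```latex
P(\chi^2_2 > z) = \int_z^{\infty} \tfrac{1}{2}\, e^{-u/2}\, du = e^{-z/2},
\qquad\text{so}\qquad
e^{-z/2} = \alpha \iff z = -2\ln\alpha .
```

For example, $\alpha = 0.05$ gives $z = -2\ln 0.05 \approx 5.99$, the familiar upper 5% point of $\chi^2_2$.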

Confidence intervals. If the true values of $\sigma_\lambda$,
$\sigma_\mu$ and $\rho$ could be inserted in $Z$, a confidence ellipse
could be found for $\lambda$ and $\mu$, based on the estimates
$\hat\lambda$ and $\hat\mu$, by setting $Z = -2\ln\alpha$. In practice,
of course, these parameters will not be known; the estimates
$\hat\lambda$ and $\hat\mu$ could be used to estimate them, using
Figures 4 and 5, but so many new stochastic elements will now be
introduced that $Z$ will no longer be $\chi^2_2$ distributed.

Nevertheless, the success of the $\chi^2_2$ approximation in the case of
known variances and correlations suggests that perhaps a formula like
$Z$ can be found, based only on $\hat\lambda$, $\hat\mu$ and $\hat{r}$,
which will again be $\chi^2_2$ distributed and which can be used to give
a confidence zone.

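After considerable experimentation, the following model was determined:

(1) Given the estimates $\hat\lambda$ and $\hat\mu$
($\hat\lambda < \hat\mu$; $\hat{r} = \hat\mu/\hat\lambda$), replace
$\sigma_\lambda$ by $a\hat\lambda$, $\sigma_\mu$ by
$a\hat{r}\hat\lambda$, $\rho$ by $1-\exp(-b\hat\lambda)$, and then
calculate $Z$.

(2) Set $Z = -2\ln\alpha$ to obtain a $100(1-\alpha)\%$ confidence
ellipse for $\lambda$ and $\mu$.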
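Steps (1) and (2) can be sketched as a membership test for the confidence ellipse. The function names `plug_in_z` and `in_confidence_ellipse` are hypothetical; the defaults `a=0.4` and `b=2.5` are the values the text reports for $n = 5$ cycles, and the sketch assumes $\hat\lambda < \hat\mu$ as in step (1).

```python
import math

def plug_in_z(lam, mu, lam_hat, mu_hat, a=0.4, b=2.5):
    """Z with sigma_lambda, sigma_mu and rho replaced by the empirical model."""
    r_hat = mu_hat / lam_hat
    sd_lam = a * lam_hat                       # replaces sigma_lambda
    sd_mu = a * r_hat * lam_hat                # replaces sigma_mu
    rho = 1.0 - math.exp(-b * lam_hat)         # replaces rho
    u = (lam_hat - lam) / sd_lam
    v = (mu_hat - mu) / sd_mu
    return (u * u - 2.0 * rho * u * v + v * v) / (1.0 - rho * rho)

def in_confidence_ellipse(lam, mu, lam_hat, mu_hat, alpha=0.05, a=0.4, b=2.5):
    """True if (lam, mu) lies inside the 100(1 - alpha)% ellipse."""
    return plug_in_z(lam, mu, lam_hat, mu_hat, a, b) <= -2.0 * math.log(alpha)
```

The boundary of the ellipse is the set of points $(\lambda, \mu)$ at which `plug_in_z` equals $-2\ln\alpha$, so the ellipse can be drawn by evaluating the function over a grid.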

The forms of the expressions above were suggested by Figures 4 and 5; it
remained to find values $a$ and $b$ by experimentation, to give $Z$ an
approximate $\chi^2_2$ distribution. For $n = 5$ cycles, good results
were obtained, for $\lambda < 1.0$ and for $r$ up to 4, with $a = 0.4$
and $b = 2.5$. The distribution of $Z$ was tested by testing whether
$\exp(-Z/2)$ is uniform between 0 and 1. Note that the $a$ and $b$ are
not the values which would give the "best fit" to Figures 4 and 5; they
work because the various compensating errors in the model give a $Z$
which has the required distribution.
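The uniformity check rests on the fact that $\exp(-Z/2)$ is uniform on (0, 1) when $Z$ is $\chi^2_2$. The sketch below illustrates this with stand-in data: exact $\chi^2_2$ values are generated as sums of squares of two standard normal variates, whereas in the study $Z$ would come from the Monte Carlo runs. The text does not say which test of uniformity was used; the Kolmogorov-Smirnov distance computed by the hypothetical helper `ks_uniform` is one possible choice.

```python
import math
import random

def ks_uniform(us):
    """Kolmogorov-Smirnov distance between a sample and the uniform (0, 1) law."""
    us = sorted(us)
    n = len(us)
    return max(max((i + 1) / n - u, u - i / n) for i, u in enumerate(us))

rng = random.Random(0)
# stand-in chi-squared(2) variates: sum of squares of two standard normals
zs = [rng.gauss(0.0, 1.0) ** 2 + rng.gauss(0.0, 1.0) ** 2 for _ in range(2000)]
us = [math.exp(-z / 2.0) for z in zs]      # should look uniform on (0, 1)
d = ks_uniform(us)
```

A small value of `d` is consistent with uniformity; a model for $Z$ whose transformed values gave a large distance would be rejected.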

(f) Final Comments: Figures 1 to 3, apart from showing the results of
the Monte Carlo studies, are useful to provide immediate approximate
confidence intervals for the parameters. Although the model for $Z$
above was derived empirically, the successful approximation of
$\exp(-Z/2)$ by a uniform random variable means that a confidence
ellipse can be drawn, with reliable confidence level $100(1-\alpha)\%$,
by setting $\exp(-Z/2) = \alpha$, for all values of $\alpha$. Further,
the ellipse is a confidence zone for both parameters taken together, and
it can easily be drawn by computer, when estimates $\hat\lambda$ and
$\hat\mu$ have been found. In fact, if $a$ and $b$ can be provided for
other values of $n$, over reasonable ranges of $\lambda$ and $\mu$, to
give $Z$ the $\chi^2_2$ distribution, the technique of estimation and
the provision of confidence zones will depend in a minimal fashion on
the true parameters, and can essentially be fully automated. Preliminary
investigation for $n = 10$ suggests that this is possible. It should be
worthwhile to investigate the technique further, to see, for example,
how $n_t$ (the number of observations required) and the size of the
confidence ellipse depend on $n$.